Novel Techniques to Reduce Search Space in Periodic-Frequent Pattern Mining
نویسندگان
چکیده
Periodic-frequent patterns are an important class of regularities that exist in a transactional database. Informally, a frequent pattern is said to be periodic-frequent if it appears at a regular interval specified by the user (i.e., periodically) in a database. A pattern-growth algorithm, called PFP-growth, has been proposed in the literature to discover the patterns. This algorithm constructs a tid-list for a pattern and performs a complete search on the tid-list to determine whether the corresponding pattern is a periodic-frequent or a non-periodic-frequent pattern. In very large databases, the tid-list of a pattern can be very long. As a result, the task of performing a complete search over a pattern’s tid-list can make the pattern mining a computationally expensive process. In this paper, we have made an effort to reduce the computational cost of mining the patterns. In particular, we apply greedy search on a pattern’s tid-list to determine the periodic interestingness of a pattern. The usage of greedy search facilitate us to prune the non-periodic-frequent patterns with a sub-optimal solution, while finds the periodic-frequent patterns with the global optimal solution. Thus, reducing the computational cost of mining the patterns without missing any knowledge pertaining to the periodic-frequent patterns. We introduce two novel pruning techniques, and extend them to improve the performance of PFP-growth. We call the algorithm as PFP-growth++. Experimental results show that PFPgrowth++ is runtime efficient and highly scalable as well.
منابع مشابه
Discovering partial periodic-frequent patterns in a transactional database
Time and frequency are two important dimensions to determine the interestingness of a pattern in a database. Periodic-frequent patterns are an important class of regularities that exist in a database with respect to these two dimensions. Current studies on periodic-frequent pattern mining have focused on discovering full periodic-frequent patterns, i.e., finding all frequent patterns that have ...
متن کاملMax-FTP: Mining Maximal Fault-Tolerant Frequent Patterns from Databases
Mining Fault-Tolerant (FT) Frequent Patterns in real world (dirty) databases is considered to be a fruitful direction for future data mining research. In last couple of years a number of different algorithms have been proposed on the basis of Apriori-FT frequent pattern mining concept. The main limitation of these existing FT frequent pattern mining algorithms is that, they try to find all FT f...
متن کاملDiscovering Periodic-Frequent Patterns in Transactional Databases
Since mining frequent patterns from transactional databases involves an exponential mining space and generates a huge number of patterns, efficient discovery of user-interest-based frequent pattern set becomes the first priority for a mining algorithm. In many real-world scenarios it is often sufficient to mine a small interesting representative subset of frequent patterns. Temporal periodicity...
متن کاملEfficient Maximal Frequent Itemset Mining by Pattern - Aware Dynamic Scheduling
While frequent pattern mining is fundamental for many data mining tasks, mining maximal frequent itemsets efficiently is important in both theory and applications of frequent itemset mining. The fundamental challenge is how to search a large space of item combinations. Most of the existing methods search an enumeration tree of item combinations in a depthfirst manner. In this thesis, we develop...
متن کاملFUZZY GRAVITATIONAL SEARCH ALGORITHM AN APPROACH FOR DATA MINING
The concept of intelligently controlling the search process of gravitational search algorithm (GSA) is introduced to develop a novel data mining technique. The proposed method is called fuzzy GSA miner (FGSA-miner). At first a fuzzy controller is designed for adaptively controlling the gravitational coefficient and the number of effective objects, as two important parameters which play major ro...
متن کامل